15 research outputs found

Identifying health status of wind turbines by using self organizing maps and interpretation-oriented post-processing tools

Author: Blanco Alejandro
Cusidó Roura Jordi
Gibert Karina
Marti Puig Pere
Sole Casals Jordi
Publication venue: 'MDPI AG'
Publication date: 01/01/2018

Identifying the health status of wind turbines is critical to reducing the impact of failures on generation costs (between 25% and 35%). This is a time-consuming task, since a human expert has to explore the turbines individually. Methods: To optimize this process, we present a strategy based on Self Organizing Maps, clustering and a further grouping of turbines based on the centroids of their SOM clusters, generating groups of turbines that behave similarly with respect to subsystem failure. The human expert can then diagnose the health of the wind farm by analysing a small sample from each group. Post-processing tools such as Class panel graphs and Traffic lights panels enhance the conceptualization of the clusters, providing additional information about the real scenarios the clusters point to and contributing to a better diagnosis. Results: The proposed approach has been tested in real wind farms with different characteristics (number of wind turbines, manufacturers, power, type of sensors, ...) and compared with classical clustering. Conclusions: Experimental results show that the healthy, unhealthy and intermediate states have been detected. Moreover, the operational modes identified for each wind turbine improve on those obtained with classical clustering techniques, capturing the intrinsic stationarity of the data.

Peer Reviewed. Postprint (published version).

Multidisciplinary Digital Publishing Institute

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Learning from Incomplete Features by Simultaneous Training of Neural Networks and Sparse Coding

Author: Caiafa César Federico
Sole Casals Jordi
Wang Ziyao
Zhao Qibin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2021

In this paper, the problem of training a classifier on a dataset with incomplete features is addressed. We assume that different subsets of features (random or structured) are available at each data instance. This situation typically occurs in applications where not all the features are collected for every data sample. A new supervised learning method is developed to train a general classifier, such as a logistic regression or a deep neural network, using only a subset of features per sample, while assuming sparse representations of data vectors on an unknown dictionary. Sufficient conditions are identified such that, if it is possible to train a classifier on incomplete observations so that their reconstructions are well separated by a hyperplane, then the same classifier also correctly separates the original (unobserved) data samples. Extensive simulation results on synthetic and well-known datasets are presented that validate our theoretical findings and demonstrate the effectiveness of the proposed method compared to traditional data imputation approaches and one state-of-the-art algorithm.

Affiliation: Caiafa, César Federico. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina
Wang, Ziyao. South East University; China
Sole Casals, Jordi. University of Vic; Spain
Zhao, Qibin. Center for Advanced Intelligence Project; Japan
Event: IEEE Computer Society Conference on Computer Vision and Pattern Recognition 2021, New York, United States

CONICET Digital

Gene filtering with optimal threshold selection

Author: Bau Macia Josep
Caiafa César Federico
Lew Sergio Eduardo
Sole Casals Jordi
Publication venue: Universidad Autónoma de Barcelona
Publication date: 01/01/2012

Gene filtering is a useful preprocessing technique often applied to microarray datasets. However, it is not common practice, because clear guidelines are lacking and it bears the risk of excluding some potentially relevant genes. In this work, we propose to model microarray data as a mixture of two Gaussian distributions, which allows us to obtain an optimal filter threshold in terms of the gene expression level.

Affiliation: Bau Macia, Josep. Universidad de Vic; Spain
Sole Casals, Jordi. Universidad de Vic; Spain
Caiafa, César Federico. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina
Lew, Sergio Eduardo. Universidad de Buenos Aires. Facultad de Ingeniería. Departamento de Electronica; Argentina
Event: The Barcelona International Conference on Advances in Statistics, Barcelona, Spain
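The two-Gaussian idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: it assumes a plain EM fit of a one-dimensional two-component mixture and takes as threshold the expression level between the two means where the weighted densities cross. The function names, the initialisation and the synthetic data are all hypothetical.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_two_gaussians(values, iterations=100):
    """Fit a two-component 1-D Gaussian mixture with plain EM (assumed method)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    # Crude initialisation: means of the lower and upper halves of the data.
    means = [sum(ordered[:half]) / half,
             sum(ordered[half:]) / (len(ordered) - half)]
    spread = (ordered[-1] - ordered[0]) / 4.0 or 1.0
    stds = [spread, spread]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: responsibility of component 0 for every value.
        resp = []
        for x in values:
            p0 = weights[0] * normal_pdf(x, means[0], stds[0])
            p1 = weights[1] * normal_pdf(x, means[1], stds[1])
            total = p0 + p1
            resp.append(p0 / total if total > 0 else 0.5)
        # M-step: re-estimate weight, mean and standard deviation.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - v for v in resp]
            mass = sum(r)
            if mass < 1e-12:
                continue
            means[k] = sum(ri * x for ri, x in zip(r, values)) / mass
            var = sum(ri * (x - means[k]) ** 2 for ri, x in zip(r, values)) / mass
            stds[k] = max(math.sqrt(var), 1e-6)
            weights[k] = mass / len(values)
    return weights, means, stds


def filter_threshold(weights, means, stds, steps=60):
    """Bisect between the two means for the point where the weighted
    densities are equal. Assumes each component dominates at its own mean."""
    low_k, high_k = (0, 1) if means[0] <= means[1] else (1, 0)
    lo, hi = means[low_k], means[high_k]

    def diff(x):
        return (weights[low_k] * normal_pdf(x, means[low_k], stds[low_k])
                - weights[high_k] * normal_pdf(x, means[high_k], stds[high_k]))

    for _ in range(steps):
        mid = (lo + hi) / 2.0
        if diff(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Synthetic "expression levels": a low (background) and a high (expressed) group.
rng = random.Random(0)
data = [rng.gauss(2.0, 0.5) for _ in range(500)] + [rng.gauss(8.0, 1.0) for _ in range(500)]
weights, means, stds = fit_two_gaussians(data)
threshold = filter_threshold(weights, means, stds)
print("estimated threshold:", round(threshold, 2))
```

Genes whose expression level falls below `threshold` would be the candidates for filtering under this assumed criterion; other definitions of "optimal" (for example, one that weighs the cost of discarding relevant genes) would give a different cut.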

CONICET Digital

Decomposition methods for machine learning with small, incomplete or noisy datasets

Author: Caiafa César Federico
Marti Puig Pere
Sole Casals Jordi
Sun Zhe
Tanaka Toshihisa
Publication venue: 'MDPI AG'
Publication date: 01/11/2020

In many machine learning applications, measurements are sometimes incomplete or noisy, resulting in missing features. In other cases, and for different reasons, the datasets are originally small, and therefore more data samples are required to derive useful supervised or unsupervised classification methods. Correct handling of incomplete, noisy or small datasets in machine learning is a fundamental and classic challenge. In this article, we provide a unified review of recently proposed methods based on signal decomposition for missing features imputation (data completion), classification of noisy samples and artificial generation of new data samples (data augmentation). We illustrate the application of these signal decomposition methods in diverse selected practical machine learning examples, including brain computer interface, epileptic intracranial electroencephalogram signals classification, face recognition/verification and water networks data analysis. We show that a signal decomposition approach can provide valuable tools to improve machine learning performance with low quality datasets.

Affiliation: Caiafa, César Federico. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina
Sole Casals, Jordi. Center for Advanced Intelligence; Japan
Marti Puig, Pere. University of Catalonia; Spain
Sun, Zhe. RIKEN; Japan
Tanaka, Toshihisa. Tokyo University of Agriculture and Technology; Japan

CONICET Digital

Detection of Wind Turbine Failures through Cross-Information between Neighbouring Turbines

Author: Caiafa Cesar F.
Cusidó Roura Jordi
Lozano Francisco J.
Marti Puig Pere
Serra Serra Moisès
Sole Casals Jordi
Publication venue: 'MDPI AG'
Publication date: 01/09/2022

In this paper, the time variation of signals from several SCADA systems of geographically close turbines is analysed and compared. When operating correctly, the turbines show a clear pattern of joint variation. However, the presence of a failure in one of the turbines causes the signals from the faulty turbine to decouple from the pattern. From this information, SCADA data is used to determine, firstly, how to derive reference signals describing this pattern and, secondly, how to compare the evolution of different turbines with respect to this joint variation. This makes it possible to determine whether the behaviour of the group is correct, because the turbines maintain the well-functioning patterns, or whether they are decoupled. The presented strategy is very effective and can provide important support for decision making in turbine maintenance and, in the near future, for improving the classification of signals used to train supervised normality models. In addition to being very effective, it is a low computational cost strategy, which can add great value to the SCADA data systems present in wind farms.

Peer Reviewed.
Sustainable Development Goals: 7 - Affordable and Clean Energy
7.a - By 2030, enhance international cooperation to facilitate access to clean energy research and technology, including renewable energy, energy efficiency and advanced and cleaner fossil-fuel technology, and promote investment in energy infrastructure and clean energy technology
7.b - By 2030, expand infrastructure and upgrade technology for supplying modern and sustainable energy services for all developing countries, in particular the least developed countries, small island developing states and landlocked developing countries, in accordance with their respective programmes of support
Postprint (published version).
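The decoupling idea described in this abstract can be sketched in a few lines. The abstract does not say how the reference signals are derived, so this sketch assumes the simplest option: a per-timestamp median across neighbouring turbines, with a turbine flagged when its mean absolute deviation from that reference exceeds a tolerance. The turbine names, readings and tolerance are made up for illustration.

```python
from statistics import median


def reference_signal(signals):
    """Per-timestamp median across turbines (assumed reference construction).

    signals maps a turbine name to a list of readings on a shared time grid.
    """
    return [median(step) for step in zip(*signals.values())]


def decoupling_scores(signals):
    """Mean absolute deviation of each turbine from the reference signal."""
    ref = reference_signal(signals)
    return {
        name: sum(abs(v - r) for v, r in zip(values, ref)) / len(ref)
        for name, values in signals.items()
    }


def flag_decoupled(signals, tolerance):
    """Names of turbines whose score exceeds the tolerance, sorted."""
    scores = decoupling_scores(signals)
    return sorted(name for name, score in scores.items() if score > tolerance)


# Hypothetical temperature readings from three neighbouring turbines.
signals = {
    "T1": [50, 52, 55, 53, 51],
    "T2": [51, 53, 56, 54, 52],
    "T3": [50, 53, 60, 63, 66],  # drifts away from the joint pattern
}
print(reference_signal(signals))     # [50, 53, 56, 54, 52]
print(flag_decoupled(signals, 2.0))  # ['T3']
```

A median is used here because one faulty turbine barely moves it, which keeps the reference representative of the healthy majority; this is a design choice of the sketch, not a claim about the paper.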

UPCommons. Portal del coneixement obert de la UPC

CONICET Digital

Directory of Open Access Journals

Servicio de Difusión de la Creación Intelectual

Serial-EMD: Fast Empirical Mode Decomposition Method for Multi-dimensional Signals Based on Serialization

Author: Caiafa César Federico
Duan Feng
Feng Fan
Marti Puig Pere
Sole Casals Jordi
Sun Zhe
Zhang Jin
Publication venue: 'Elsevier BV'
Publication date: 01/09/2021

Empirical mode decomposition (EMD) has developed into a prominent tool for adaptive, scale-based signal analysis in various fields like robotics, security and biomedical engineering. Since the dramatic increase in the amount of data puts forward higher requirements for the capability of real-time signal analysis, it is difficult for existing EMD and its variants to trade off the growth of data dimension and the speed of signal analysis. In order to decompose multi-dimensional signals at a faster speed, we present a novel signal-serialization method (serial-EMD), which concatenates multi-variate or multi-dimensional signals into a one-dimensional signal and uses various one-dimensional EMD algorithms to decompose it. To verify the effects of the proposed method, synthetic multi-variate time series, artificial 2D images with various textures and real-world facial images are tested. Compared with existing multi-EMD algorithms, the decomposition time is significantly reduced. In addition, the results of facial recognition with Intrinsic Mode Functions (IMFs) extracted using our method can achieve a higher accuracy than those obtained by existing multi-EMD algorithms, which demonstrates the superior performance of our method in terms of the quality of IMFs. Furthermore, this method can provide a new perspective to optimize the existing EMD algorithms, that is, transforming the structure of the input signal rather than being constrained by developing envelope computation techniques or signal decomposition methods. In summary, the study suggests that the serial-EMD technique is a highly competitive and fast alternative for multi-dimensional signal analysis.

Affiliation: Zhang, Jin. Nankai University; China
Feng, Fan. Nankai University; China
Marti Puig, Pere. Central University of Catalonia; Spain
Caiafa, César Federico. Provincia de Buenos Aires. Gobernación. Comisión de Investigaciones Científicas. Instituto Argentino de Radioastronomía. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - La Plata. Instituto Argentino de Radioastronomía; Argentina
Sun, Zhe. RIKEN; Japan
Duan, Feng. Nankai University; China
Sole Casals, Jordi. Central University of Catalonia; Spain
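The serialization step described in this abstract (concatenate the channels, decompose once in one dimension, split the components back per channel) can be sketched as follows. This covers only the bookkeeping: the one-dimensional decomposition used here is a moving-average split standing in for an EMD routine, plain concatenation is assumed with no special treatment of the joins between channels, and all function names are hypothetical.

```python
def serialize(channels):
    """Concatenate channels into one 1-D list and remember their lengths."""
    flat, lengths = [], []
    for channel in channels:
        flat.extend(channel)
        lengths.append(len(channel))
    return flat, lengths


def deserialize(component, lengths):
    """Cut a 1-D component back into per-channel pieces."""
    pieces, start = [], 0
    for length in lengths:
        pieces.append(component[start:start + length])
        start += length
    return pieces


def moving_average_split(signal, window=3):
    """Placeholder for a 1-D EMD routine (NOT EMD): returns [detail, trend],
    two components that add up to the input signal."""
    half = window // 2
    trend = []
    for i in range(len(signal)):
        segment = signal[max(0, i - half):i + half + 1]
        trend.append(sum(segment) / len(segment))
    detail = [x - t for x, t in zip(signal, trend)]
    return [detail, trend]


def serial_decompose(channels, decompose_1d=moving_average_split):
    """Decompose all channels with a single 1-D decomposition call.

    Returns a list of components; each component holds one piece per channel.
    """
    flat, lengths = serialize(channels)
    return [deserialize(component, lengths) for component in decompose_1d(flat)]


channels = [[1.0, 2.0, 3.0], [10.0, 20.0, 30.0, 40.0]]
components = serial_decompose(channels)
print(len(components), [len(piece) for piece in components[0]])
```

Any one-dimensional decomposition that returns equal-length components can be passed as `decompose_1d`, which is the point the abstract makes about reusing existing one-dimensional EMD algorithms.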

CONICET Digital

Maximum likelihood Linear Programming Data Fusion for Speaker Recognition

Author: Enric Monte-Moreno
Jordi Sole-Casals
Marcos Faundez-Zanuy
Mohamed Chetouani
Publication venue: 'Elsevier BV'
Publication date: 01/01/2009

Biometric system performance can be improved by means of data fusion. Several kinds of information can be fused in order to obtain a more accurate classification (identification or verification) of an input sample. In this paper we present a method for computing the weights in a weighted sum fusion for score combinations, by means of a likelihood model. The maximum likelihood estimation is set as a linear programming problem. The scores are derived from a GMM classifier working on a different feature extractor. Our experimental results assessed the robustness of the system against changes over time (different sessions) and against a change of microphone. The improvements obtained were significantly better (error bars of two standard deviations) than a uniform weighted sum, a uniform weighted product or the best single classifier. The proposed method scales computationally with the number of scores to be fused in the same way as the simplex method for linear programming.
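For readers unfamiliar with weighted sum fusion, the combination rule itself is short. The sketch below shows only that rule, with hand-picked weights and scores as placeholders; the linear-programming estimation of the weights described in the abstract is not reproduced here.

```python
def fuse_scores(scores, weights):
    """Weighted sum of per-classifier scores for one input sample."""
    if len(scores) != len(weights):
        raise ValueError("scores and weights must have the same length")
    return sum(w * s for w, s in zip(weights, scores))


# Hypothetical scores from three classifiers for the same sample.
scores = [0.2, 0.8, 0.5]

# Uniform weights: the baseline the abstract compares against.
uniform = fuse_scores(scores, [1.0 / 3.0] * 3)

# Non-uniform weights (placeholders, not values estimated by the paper's method).
weighted = fuse_scores(scores, [0.5, 0.25, 0.25])

print(round(uniform, 3), round(weighted, 3))
```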

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

RIUVic

Seizure onset zone classification based on imbalanced iEEG with data augmentation.

Author: Sole-Casals Jordi
Sugano Hidenori
Tanaka Toshihisa
Zhao Xuyang
Publication venue: J Neural Eng
Publication date: 18/11/2022

Objective. Identifying the seizure onset zone (SOZ) in patients with focal epilepsy is the critical information required for surgery. However, collecting this information is challenging, time-consuming, and subjective. Some machine learning methods reduce the workload of clinical experts in intracranial electroencephalogram (iEEG) visual diagnosis but face significant challenges because interictal iEEG clinical data often suffer from a significant class imbalance. We aim to generate synthetic data for the minority class. Approach. To make the clinically imbalanced data suitable for machine learning, we introduce an EEG augmentation method (EEGAug). The EEGAug method randomly selects several samples from the minority class and transforms them into the frequency domain. Then, different frequency bands from different samples are used to compose new data. Finally, a synthetic sample is generated after converting the new data back to the time domain. Main results. The imbalanced clinical iEEG data can be balanced and applied to machine learning models using the method. A one-dimensional convolutional neural network model is used to classify the SOZ and non-SOZ data. We compare the EEGAug method with other data augmentation methods and with a class-balanced focal loss function, which is also used for solving the data imbalance problem by adjusting the weights between the minority and majority classes. The results show that the EEGAug method performs best in most data. Significance. Data imbalance is a widespread clinical problem. The EEGAug method can flexibly generate synthetic data for the minority class, yielding synthetic and raw data with a high distribution similarity. By using the EEGAug method, clinical data can be used in machine learning models.
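The three steps of the approach (transform to the frequency domain, compose a spectrum from bands of different samples, transform back) can be sketched with a small hand-written DFT. This is an illustration under assumptions, not the EEGAug implementation: the band edges, the toy signals and the rule of drawing one random donor per band (which may pick the same donor twice) are all hypothetical.

```python
import cmath
import math
import random


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real signal."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spectrum):
    """Inverse DFT, keeping the real part (the spectrum is built to be symmetric)."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)).real / n
            for t in range(n)]


def band_mix(samples, band_edges, rng):
    """Compose a synthetic sample from frequency bands of randomly chosen donors.

    samples: equal-length real signals from the minority class.
    band_edges: ascending bin indices covering bins 0..n//2, for example
    [0, 3, 9] with n = 16 gives the bands [0, 3) and [3, 9).
    """
    n = len(samples[0])
    if band_edges[0] != 0 or band_edges[-1] != n // 2 + 1:
        raise ValueError("band_edges must start at 0 and end at n // 2 + 1")
    spectra = [dft(sample) for sample in samples]
    mixed = [0j] * n
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        donor = rng.choice(spectra)  # one donor sample per band
        for k in range(lo, hi):
            mixed[k] = donor[k]
            if k > 0 and 2 * k != n:
                # Mirror bin keeps conjugate symmetry, so the output is real.
                mixed[n - k] = donor[k].conjugate()
    return idft_real(mixed)


# Two toy "minority class" signals with energy in different bands.
n = 16
slow = [math.sin(2 * math.pi * 1 * t / n) for t in range(n)]
fast = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
synthetic = band_mix([slow, fast], [0, 3, 9], random.Random(0))
print([round(v, 3) for v in synthetic])
```

For real recordings a fast transform would replace the naive DFT; it is written out here only to keep the sketch free of external dependencies.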

Apollo (Cambridge)

Initialisation of Nonlinearities for PNL and Wiener systems Inversion

Author: Christian Jutten
Dinh-tuan Pham
Jordi Sole-casals

Abstract. This paper proposes a very fast method for blindly initializing a nonlinear mapping which transforms a sum of random variables. The method provides a surprisingly good approximation even when the basic assumption is not fully satisfied. The method can be used successfully for initializing the nonlinearity in post-nonlinear mixtures or in Wiener system inversion, improving algorithm speed and convergence.

CiteSeerX

Identifying health status of wind turbines by using self organizing maps and interpretation-oriented post-processing tools

Author: Blanco Alejandro
Cusidó Roura Jordi
Gibert Karina
Marti Puig Pere
Sole Casals Jordi

RECERCAT